Metagenomic Analysis of the Airborne Environment in Urban Spaces
Overview
The organisms present in aerosol microenvironments are important factors in public health and in the detection of potential antigens, especially in highly populated urban areas. Aerosol microenvironments have not been well studied or characterized, largely because these microorganisms are difficult to collect and because the microenvironments are constantly changing. Previous culture-based studies have shown that these microenvironments change with meteorological conditions, weather, and agricultural or urban surroundings. Many of these microorganisms are difficult to culture, however, so there has been increasing interest in metagenomic analysis to find out which pathogens could contact and potentially harm humans. Metagenomic studies done at waste facilities and in suburban areas indicate that a wide variety of factors determine which aerosols are present at a given place and time. A system called BioWatch currently monitors ambient air in urban areas as an initial defense against potential pathogens. This metagenomic analysis used seasonal samples gathered from BioWatch filters in Washington, DC, and next-generation sequencing to perform whole-metagenome sequencing of the samples.

Method
Samples were collected from eleven filters across the region every day for one week during each season: January, April, July, and October of 2009. Samples were also collected when Bacillus thuringiensis serovar kurstaki spores were actively dispersed as a pesticide (May 7, 2007). Filters from the same week were combined into one sample for extraction. The DNA from these samples was extracted in its entirety and purified. Prior to sequencing, the samples were split into two independent data sets, each covering all four seasons. The DNA samples were then sequenced on a next-generation Illumina platform, which fragments the DNA, amplifies the fragments, and reads each sequence one base at a time by means of fluorescent tags.
Genomic composition was determined by mapping the sequences with Bowtie, a short-read aligner, which allowed differentiation among bacteria, viruses, fungi, and other eukaryotic cells. All hits from the Bowtie run were kept for analysis and characterized by their taxonomic ID. Bacteria with multiple sub-strains were grouped together to avoid having to distinguish separately between each sub-strain of an organism.

Results
The total number of reads that mapped to a specific taxonomic ID was used to assign the relative abundance of each aerosol organism. The data were initially grouped by organism type (bacteria, fungi, plants, etc.) and season. The resulting graph reinforced the expectation that bacteria would be the primary organism type on the filters. The overall distribution was dominated by bacteria, which were at their highest levels during the summer, followed by the winter. This ordering seems counterintuitive but was consistent with the hypotheses of the analysis. The number of informative reads was elevated in the spring because of the increased quantity of plant matter, which brings much larger genomes to be mapped. Plants overall were observed at expected levels: they were most abundant in spring, followed by summer. The top 15 genomes, those with the highest year-round abundance, were then ranked by the total reads for their specific taxonomic ID. The majority of the top-ranked bacterial genomes were associated with normal soil microbiota. However, many species were associated with normal skin flora, such as Klebsiella and Staphylococcus species. Sequences from plants and fungi were also prevalent and notably peaked during the spring, as expected. Betula nana, a birch shrub, contributed a large share of the plant sequences. Invertebrate genomes, primarily Aedes aegypti, were also characterized and peaked during the summer, although their levels were very low throughout the year.
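The read-counting steps described above (grouping sub-strains, tallying reads per taxonomic ID, converting the tallies to relative abundance, and ranking genomes by year-round totals) can be sketched in Python. This is a minimal illustration, not the pipeline used in the study: the function names, the (season, taxon ID) input layout, and the example identifiers are all assumptions made for the sketch.

```python
from collections import Counter, defaultdict


def count_reads(hits, strain_to_parent=None):
    """Tally mapped reads per taxon for each season.

    hits: iterable of (season, taxon_id) pairs, one per mapped read
          (a hypothetical layout, not the aligner's native output).
    strain_to_parent: optional dict that collapses sub-strain IDs onto
          a parent taxon, mirroring the sub-strain grouping in the text.
    """
    strain_to_parent = strain_to_parent or {}
    counts = defaultdict(Counter)
    for season, taxon_id in hits:
        parent = strain_to_parent.get(taxon_id, taxon_id)
        counts[season][parent] += 1
    return counts


def relative_abundance(counts):
    """Convert per-season read counts into fractions of that season's reads."""
    abundance = {}
    for season, taxon_counts in counts.items():
        total = sum(taxon_counts.values())
        abundance[season] = {t: n / total for t, n in taxon_counts.items()}
    return abundance


def top_genomes(counts, n=15):
    """Rank taxa by total reads summed over all seasons."""
    yearly = Counter()
    for taxon_counts in counts.values():
        yearly.update(taxon_counts)
    return [taxon for taxon, _ in yearly.most_common(n)]


# Made-up example data: "A1" is treated as a sub-strain of "A".
example_hits = [
    ("summer", "A"), ("summer", "A"), ("summer", "B"),
    ("winter", "A1"), ("winter", "C"),
]
example_counts = count_reads(example_hits, strain_to_parent={"A1": "A"})
example_abundance = relative_abundance(example_counts)
example_top = top_genomes(example_counts, n=1)
```

Normalizing within each season keeps seasons with different sequencing depth comparable, while the ranking uses raw year-round read totals, as the text describes for the top 15 genomes.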
There were also sequences associated with phages (synthetic), vertebrates, and other viruses, but their relative amounts were negligible throughout the year. This grouping also suggested that high-abundance genomes did not vary between seasons as much as less prevalent genomes did.

Conclusion
Airborne microbial communities are influential factors in both public health and environmental maintenance. The majority of current identifications are done using PCR, which targets only specific biological agents. This analysis showed the value of using high-throughput next-generation sequencing to examine the taxa of an environment holistically and to identify specific pathogens. The identification of B. thuringiensis serovar kurstaki served as a case study for identifying specific pathogens, not because B. thuringiensis is a dangerous etiological agent, but because of its similarity to B. anthracis, which would cause national alarm if identified. The main drawback of using sequencing to identify specific pathogens is the time the procedure takes. However, if a pathogen were identified in a BioWatch filter, sequencing data could be gathered from other filters to provide more information about the prevalence of the pathogen. This analysis also adds to our knowledge of the complexity of aerosol communities in urban spaces and of how they change between seasons. Knowing the microbial background of a community allows us to monitor any significant changes in that environment and prepare for the potential effects of a dynamic environment.

References
Langmead B. (2010): Aligning short sequence reads with Bowtie
Nicholas A, Thissen J. (2014): Metagenomic Analysis of the Aerosol Environment in Urban Spaces